linear svm
Computational Linguistics Meets Libyan Dialect: A Study on Dialect Identification
Essgaer, Mansour, Massud, Khamis, Mamlook, Rabia Al, Ghmaid, Najah
This study investigates logistic regression, linear support vector machine, multinomial Naive Bayes, and Bernoulli Naive Bayes for classifying Libyan dialect utterances gathered from Twitter. The dataset used is the QADI corpus, which consists of 540,000 sentences across 18 Arabic dialects. Preprocessing challenges include handling inconsistent orthographic variations and non-standard spellings typical of the Libyan dialect. The chi-square analysis revealed that certain features, such as email mentions and emotion indicators, were not significantly associated with dialect classification and were thus excluded from further analysis. Two main experiments were conducted: (1) evaluating the significance of meta-features extracted from the corpus using the chi-square test and (2) assessing classifier performance using different word and character n-gram representations. The classification experiments showed that Multinomial Naive Bayes (MNB) achieved the highest accuracy of 85.89% and an F1-score of 0.85741 when using a (1,2) word n-gram and (1,5) character n-gram representation. In contrast, Logistic Regression and Linear SVM exhibited slightly lower performance, with maximum accuracies of 84.41% and 84.73%, respectively. Additional evaluation metrics, including log loss, Cohen's kappa, and the Matthews correlation coefficient, further supported the effectiveness of MNB in this task. The results indicate that carefully selected n-gram representations and classification models play a crucial role in improving the accuracy of Libyan dialect identification. This study provides empirical benchmarks and insights for future research in Arabic dialect NLP applications.
- Asia > Malaysia (0.04)
- Africa > Middle East > Libya > Sabha District > Sabha (0.04)
- North America > United States > Michigan (0.04)
- Research Report > New Finding (1.00)
- Research Report > Experimental Study (1.00)
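The Libyan dialect study above reports its best result with a combined (1,2) word n-gram and (1,5) character n-gram representation. The following is a minimal sketch of what such a feature extractor might look like, using only the Python standard library. The function names, the whitespace tokenization, and the `w:`/`c:` key prefixes are assumptions made for illustration; the abstract does not describe the authors' actual pipeline, weighting scheme, or preprocessing.

```python
from collections import Counter


def word_ngrams(text, n_min=1, n_max=2):
    """Return word n-grams for n in [n_min, n_max], joined by single spaces."""
    tokens = text.split()  # assumption: plain whitespace tokenization
    grams = []
    for n in range(n_min, n_max + 1):
        for i in range(len(tokens) - n + 1):
            grams.append(" ".join(tokens[i:i + n]))
    return grams


def char_ngrams(text, n_min=1, n_max=5):
    """Return character n-grams for n in [n_min, n_max] over the raw string."""
    grams = []
    for n in range(n_min, n_max + 1):
        for i in range(len(text) - n + 1):
            grams.append(text[i:i + n])
    return grams


def ngram_features(text):
    """Count word (1,2) and character (1,5) n-grams in one feature dictionary.

    Keys are prefixed with "w:" or "c:" so the two feature spaces do not collide.
    """
    features = Counter()
    features.update("w:" + g for g in word_ngrams(text, 1, 2))
    features.update("c:" + g for g in char_ngrams(text, 1, 5))
    return features


# Placeholder input; real inputs would be dialect utterances from the corpus.
example = ngram_features("one two three")
```

Count dictionaries of this kind are the usual input to a multinomial Naive Bayes classifier, which models each class as a distribution over feature counts.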
Supplement: Novel Upper Bounds for the Constrained Most Probable Explanation Task
For instance, one type of consistency constraint encodes the restriction that only one entry from each function must be selected. A second type of consistency constraint ensures that if two functions share a variable then only entries which assign the shared variable to the same value are selected. The CMPE task adds a global constraint to the ILP formulation of MPE given in Eq. The LP-bound on the ILP given in Eq. This gives us the following linear programming relaxation.
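The equations referenced in this excerpt are not reproduced here, so the following is only a sketch of the general shape such a formulation could take, under assumed notation: $F$ is the set of functions, $S(f)$ the scope of $f$, $x_f$ an entry (assignment to $S(f)$), $\delta_f(x_f)$ a selection indicator, $\theta_f(x_f)$ the log-value of an entry, and $c_f(x_f)$ and $q$ a per-entry cost and bound standing in for the global constraint. None of these symbols are taken from the supplement itself.

```latex
\begin{align*}
\max_{\delta}\quad
  & \sum_{f \in F} \sum_{x_f} \theta_f(x_f)\, \delta_f(x_f) \\
\text{s.t.}\quad
  & \sum_{x_f} \delta_f(x_f) = 1
    && \forall f \in F \\
  & \sum_{x_f :\, x_f[X] = a} \delta_f(x_f)
    = \sum_{x_g :\, x_g[X] = a} \delta_g(x_g)
    && \forall f, g \in F,\ \forall X \in S(f) \cap S(g),\ \forall a \\
  & \sum_{f \in F} \sum_{x_f} c_f(x_f)\, \delta_f(x_f) \le q \\
  & \delta_f(x_f) \in \{0, 1\}
    && \forall f \in F,\ \forall x_f
\end{align*}
```

The first constraint selects exactly one entry per function, the second forces functions that share a variable to agree on its value, and the third is the assumed global constraint. Replacing $\delta_f(x_f) \in \{0, 1\}$ with $0 \le \delta_f(x_f) \le 1$ gives a linear programming relaxation whose optimum is an upper bound on the integer optimum.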
Machine Learning-Based Prediction of Speech Arrest During Direct Cortical Stimulation Mapping
Emami, Nikasadat, Khalilian-Gourtani, Amirhossein, Qian, Jianghao, Ratouchniak, Antoine, Chen, Xupeng, Wang, Yao, Flinker, Adeen
Identifying cortical regions critical for speech is essential for safe brain surgery in or near language areas. While Electrical Stimulation Mapping (ESM) remains the clinical gold standard, it is invasive and time-consuming. To address this, we analyzed intracranial electrocorticographic (ECoG) data from 16 participants performing speech tasks and developed machine learning models to directly predict if the brain region underneath each ECoG electrode is critical. Ground truth labels indicating speech arrest were derived independently from Electrical Stimulation Mapping (ESM) and used to train classification models. Our framework integrates neural activity signals, anatomical region labels, and functional connectivity features to capture both local activity and network-level dynamics. We found that models combining region and connectivity features matched the performance of the full feature set, and outperformed models using either type alone. To classify each electrode, trial-level predictions were aggregated using an MLP applied to histogram-encoded scores. Our best-performing model, a trial-level RBF-kernel Support Vector Machine together with MLP-based aggregation, achieved strong accuracy on held-out participants (ROC-AUC: 0.87, PR-AUC: 0.57). These findings highlight the value of combining spatial and network information with non-linear modeling to improve functional mapping in presurgical evaluation.
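The aggregation step described above turns a variable number of trial-level scores per electrode into a fixed-length input for the MLP. A minimal sketch of the histogram-encoding part is shown below. It assumes the trial-level scores are probabilities in [0, 1] and uses equal-width bins; the bin count, the normalization, and the function names are illustrative assumptions, and the MLP itself is omitted.

```python
def histogram_encode(scores, n_bins=10):
    """Encode trial-level scores in [0, 1] as a normalized fixed-length histogram."""
    if not scores:
        raise ValueError("need at least one trial-level score")
    counts = [0] * n_bins
    for s in scores:
        if not 0.0 <= s <= 1.0:
            raise ValueError("scores are assumed to lie in [0, 1]")
        # A score of exactly 1.0 falls into the last bin.
        idx = min(int(s * n_bins), n_bins - 1)
        counts[idx] += 1
    total = len(scores)
    return [c / total for c in counts]


def encode_electrodes(trial_scores):
    """Map {electrode_id: [trial scores]} to {electrode_id: histogram vector}."""
    return {eid: histogram_encode(s) for eid, s in trial_scores.items()}
```

Because every electrode yields a vector of the same length regardless of how many trials it has, the encoded vectors can be stacked and passed to any standard classifier.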
Harmonized Gradient Descent for Class Imbalanced Data Stream Online Learning
Zhou, Han, Yin, Hongpeng, Deng, Xuanhong, Huang, Yuyu, Ren, Hao
Many real-world data are sequentially collected over time and often exhibit skewed class distributions, resulting in imbalanced data streams. While existing approaches have explored several strategies, such as resampling and reweighting, for imbalanced data stream learning, our work distinguishes itself by addressing the imbalance problem through training modification, particularly focusing on gradient descent techniques. We introduce the harmonized gradient descent (HGD) algorithm, which aims to equalize the norms of gradients across different classes. By ensuring the gradient norm balance, HGD mitigates under-fitting for minor classes and achieves balanced online learning. Notably, HGD operates in a streamlined implementation process, requiring no data-buffer, extra parameters, or prior knowledge, making it applicable to any learning models utilizing gradient descent for optimization. Theoretical analysis, based on a few common and mild assumptions, shows that HGD achieves a satisfactory sub-linear regret bound. The proposed algorithm is compared with the commonly used online imbalance learning methods under several imbalanced data stream scenarios. Extensive experimental evaluations demonstrate the efficiency and effectiveness of HGD in learning imbalanced data streams.
- Asia > China > Chongqing Province > Chongqing (0.04)
- North America > United States (0.04)
- Europe > Portugal > Braga > Braga (0.04)
- Asia > China > Shanghai > Shanghai (0.04)
- Research Report (1.00)
- Instructional Material > Online (0.34)
- North America > United States > Texas (0.04)
- North America > United States > New York (0.04)
AI-driven Web Application for Early Detection of Sudden Death Syndrome (SDS) in Soybean Leaves Using Hyperspectral Images and Genetic Algorithm
Yadav, Pappu Kumar, Aggarwal, Rishik, Paudel, Supriya, Parmar, Amee, Mirzakhaninafchi, Hasan, Usmani, Zain Ul Abideen, Tchalla, Dhe Yeong, Solanki, Shyam, Mural, Ravi, Sharma, Sachin, Burks, Thomas F., Qin, Jianwei, Kim, Moon S.
Sudden Death Syndrome (SDS), caused by Fusarium virguliforme, poses a significant threat to soybean production. This study presents an AI-driven web application for early detection of SDS on soybean leaves using hyperspectral imaging, enabling diagnosis prior to visible symptom onset. Leaf samples from healthy and inoculated plants were scanned using a portable hyperspectral imaging system (398-1011 nm), and a Genetic Algorithm was employed to select five informative wavelengths (505.4, 563.7, 712.2, 812.9, and 908.4 nm) critical for discriminating infection status. These selected bands were fed into a lightweight Convolutional Neural Network (CNN) to extract spatial-spectral features, which were subsequently classified using ten classical machine learning models. Ensemble classifiers (Random Forest, AdaBoost), Linear SVM, and Neural Net achieved the highest accuracy (>98%) and minimal error across all folds, as confirmed by confusion matrices and cross-validation metrics. Poor performance by Gaussian Process and QDA highlighted their unsuitability for this dataset. The trained models were deployed within a web application that enables users to upload hyperspectral leaf images, visualize spectral profiles, and receive real-time classification results. This system supports rapid and accessible plant disease diagnostics, contributing to precision agriculture practices. Future work will expand the training dataset to encompass diverse genotypes, field conditions, and disease stages, and will extend the system for multiclass disease classification and broader crop applicability.
- North America > United States > Florida > Alachua County > Gainesville (0.14)
- North America > United States > South Dakota > Brookings County > Brookings (0.05)
- North America > United States > Iowa (0.04)
- Health & Medicine (1.00)
- Food & Agriculture > Agriculture (1.00)
- Government > Regional Government > North America Government > United States Government (0.68)
- Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)
- Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (1.00)
- Information Technology > Artificial Intelligence > Machine Learning > Evolutionary Systems (1.00)
- Information Technology > Artificial Intelligence > Machine Learning > Performance Analysis > Accuracy (0.89)
Gender Fairness of Machine Learning Algorithms for Pain Detection
Green, Dylan, Shang, Yuting, Cheong, Jiaee, Liu, Yang, Gunes, Hatice
Automated pain detection through machine learning (ML) and deep learning (DL) algorithms holds significant potential in healthcare, particularly for patients unable to self-report pain levels. However, the accuracy and fairness of these algorithms across different demographic groups (e.g., gender) remain under-researched. This paper investigates the gender fairness of ML and DL models trained on the UNBC-McMaster Shoulder Pain Expression Archive Database, evaluating the performance of various models in detecting pain based solely on the visual modality of participants' facial expressions. We compare traditional ML algorithms, Linear Support Vector Machine (L SVM) and Radial Basis Function SVM (RBF SVM), with DL methods, Convolutional Neural Network (CNN) and Vision Transformer (ViT), using a range of performance and fairness metrics. While ViT achieved the highest accuracy and a selection of fairness metrics, all models exhibited gender-based biases. These findings highlight the persistent trade-off between accuracy and fairness, emphasising the need for fairness-aware techniques to mitigate biases in automated healthcare systems. Machine Learning (ML) has become an essential tool in modern healthcare, offering the potential to automate complex tasks, such as pain detection, through images and videos [39]. However, as these technologies are adopted, ensuring fairness becomes critical to avoid perpetuating or exacerbating existing biases [79], [9], [73]. ML fairness refers to the absence of prejudice or bias in a machine learning system concerning sensitive attributes such as gender, race, or age [57]. In pain detection models, fairness ensures that individuals across different demographic groups are equally likely to be correctly classified.
- North America > United States (0.14)
- Europe > United Kingdom > England > Cambridgeshire > Cambridge (0.04)
- Europe > Finland > Northern Ostrobothnia > Oulu (0.04)
- Research Report > New Finding (1.00)
- Research Report > Experimental Study (1.00)
- Health & Medicine > Therapeutic Area > Neurology (1.00)
- Health & Medicine > Therapeutic Area > Musculoskeletal (1.00)
- Information Technology > Artificial Intelligence > Vision > Face Recognition (1.00)
- Information Technology > Artificial Intelligence > Machine Learning > Performance Analysis > Accuracy (1.00)
- Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)
- Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Support Vector Machines (0.69)
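The pain-detection entry above describes fairness as individuals in different demographic groups being equally likely to be correctly classified. One simple way to check this is to compute accuracy and true positive rate separately per group and look at the gap. The sketch below does that for binary labels; it is an illustration of the general idea only, with hypothetical function names, and is not the set of fairness metrics used in the paper, which the excerpt does not list.

```python
def group_rates(y_true, y_pred, groups):
    """Return {group: {"accuracy": ..., "tpr": ...}} for binary labels (1 = pain)."""
    stats = {}
    for yt, yp, g in zip(y_true, y_pred, groups):
        s = stats.setdefault(g, {"n": 0, "correct": 0, "pos": 0, "tp": 0})
        s["n"] += 1
        s["correct"] += int(yt == yp)
        if yt == 1:
            s["pos"] += 1
            s["tp"] += int(yp == 1)
    rates = {}
    for g, s in stats.items():
        rates[g] = {
            "accuracy": s["correct"] / s["n"],
            # TPR is undefined for a group with no positive examples.
            "tpr": s["tp"] / s["pos"] if s["pos"] else None,
        }
    return rates


def accuracy_gap(rates):
    """Largest difference in accuracy between any two groups."""
    accs = [r["accuracy"] for r in rates.values()]
    return max(accs) - min(accs)
```

A gap of zero means every group is classified correctly at the same rate; a model can still have a high overall accuracy while showing a large gap.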
Learning Covariance-Based Multi-Scale Representation of Neuroimaging Measures for Alzheimer Classification
Baek, Seunghun, Choi, Injun, Dere, Mustafa, Kim, Minjeong, Wu, Guorong, Kim, Won Hwa
Stacking excessive layers in a DNN results in a highly underdetermined system when training samples are limited, which is very common in medical applications. In this regard, we present a framework capable of deriving an efficient high-dimensional space with a reasonable increase in model size. This is done by utilizing a transform (i.e., convolution) that leverages scale-space theory with covariance structure. The overall model trains on this transform together with a downstream classifier (i.e., Fully Connected layer) to capture the optimal multi-scale representation of the original data which corresponds to task-specific components in a dual space. Experiments on neuroimaging measures from the Alzheimer's Disease Neuroimaging Initiative (ADNI) study show that our model performs better and converges faster than conventional models even when the model size is significantly reduced. The trained model is made interpretable using gradient information over the multi-scale transform to delineate personalized AD-specific regions in the brain.
- North America > United States (0.14)
- Asia > South Korea (0.14)
Reviews: Variational Information Maximization for Feature Selection
There are several important issues that I believe must be addressed. A. The authors make the following argument against current approaches. Current approaches are optimal under a pair of assumptions. These assumptions are hardly ever satisfied simultaneously. Therefore current approaches are flawed. The logic here is clearly flawed. The only thing shown is that current approaches are not optimal.
Reviews: Globally Optimal Training of Generalized Polynomial Neural Networks with Nonlinear Spectral Methods
This paper studied a particular class of feedforward neural networks that can be trained to global optimality with a linear convergence rate using a nonlinear spectral method. This method was applied to networks with one and two hidden layers. Experiments were conducted on a series of real-world datasets. As stated by the authors, the class of feedforward neural networks studied is restrictive and counterintuitive, since it imposes non-negativity on the weights of the network and maximizes the regularization of these weights. Moreover, a less popular activation function, the generalized polynomial, is required for the optimality condition. These assumptions are not quite reasonable.